Python: getting a page after it has finished loading
I tried the code below, but it always returns before the sizes are available.

    # config.url = 'http://www.neimanmarcus.com/Stuart-Weitzman-Reserve-Suede-Over-the-Knee-Boot-Black/prod179890262/p.prod'
    import urllib2
    import requests
    import config
    import time
    from lxml.cssselect import CSSSelector
    from lxml.html import fromstring

    print config.url
    headers = {
        "Host": "www.neimanmarcus.com",
        "Connection": "keep-alive",
        "Content-Length": 106,
        "Pragma": "no-cache",
        "Cache-Control": "no-cache",
        "Accept": "*/*",
        "Origin": "http://www.neimanmarcus.com",
        "X-Requested-With": "XMLHttpRequest",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "Referer": "http://www.neimanmarcus.com/Stuart-Weitzman-Reserve-Suede-Over-the-Knee-Boot-Black/prod179890262/p.prod",
        "Accept-Language": "en-US,en;q=0.8,zh-CN;q=0.6,zh;q=0.4,fr;q=0.2,cs;q=0.2,zh-TW;q=0.2"
    }
    request = urllib2.Request(config.url, headers=headers)
    html = urllib2.urlopen(request)
    time.sleep(10)
    html = html.read()
    print html
    html = fromstring(html)
    sel = CSSSelector('option.addedOption')
    try:
        options = sel(html)
        print options
    except Exception as e:
        print e

How can I get the information for the whole page (especially the boot sizes)?
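One possible explanation, assuming the `option.addedOption` elements are inserted by JavaScript after the page loads (the class name hints at this, but the question does not confirm it): `urllib2.urlopen` returns only the HTML the server sends, and `time.sleep(10)` pauses the script without running any of the page's scripts. The sketch below is in Python 3 and uses only the standard library rather than lxml. It illustrates the difference with two hypothetical HTML snippets; `OptionCollector`, `find_options`, `STATIC_HTML`, and `RENDERED_HTML` are illustrative names and markup, not taken from the site.

```python
from html.parser import HTMLParser


class OptionCollector(HTMLParser):
    """Collect the text of <option> elements that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.options = []        # text of each matching <option>
        self._in_match = False   # True while inside a matching <option>

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            # The class attribute may be missing or empty.
            classes = (dict(attrs).get("class") or "").split()
            self._in_match = self.css_class in classes
            if self._in_match:
                self.options.append("")

    def handle_endtag(self, tag):
        if tag == "option":
            self._in_match = False

    def handle_data(self, data):
        if self._in_match:
            self.options[-1] += data


def find_options(html, css_class="addedOption"):
    """Return the stripped text of every <option class="..."> match in html."""
    parser = OptionCollector(css_class)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.options]


# Hypothetical markup: what the server might send before any script runs.
STATIC_HTML = """
<select id="size">
  <option value="">Select size</option>
</select>
<script>/* a script would add the size options here */</script>
"""

# Hypothetical markup: the same element after a browser has run the script.
RENDERED_HTML = """
<select id="size">
  <option value="">Select size</option>
  <option class="addedOption" value="6">6B</option>
  <option class="addedOption" value="7">7B</option>
</select>
"""

static_sizes = find_options(STATIC_HTML)      # no matching options yet
rendered_sizes = find_options(RENDERED_HTML)  # options added by the script
print(static_sizes)
print(rendered_sizes)
```

If that assumption holds, no amount of waiting after `urlopen` changes what `read()` returns. The sizes would have to come from something that executes the page's scripts, such as a browser automation tool, or from the background request the page itself makes. Which of these applies to this site is not something the question establishes.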